Predicting Emotion Labels for Chinese Microblog Texts
نویسندگان
چکیده
We describe an experiment into detecting emotions in texts on the Chinese microblog service Sina Weibo (www.weibo.com) using distant supervision via various author-supplied emotion labels (emoticons and smilies). Existing word segmentation tools proved unreliable; better accuracy was achieved using characterbased features. Higher-order n-grams proved to be useful features. Accuracy varied according to label and emotion: while smilies are used more often, emoticons are more reliable. Happiness is the most accurately predicted emotion, with accuracies around 90% on both distant and gold-standard labels. This approach works well and achieves high accuracies for happiness and anger, while it is less effective for sadness, surprise, disgust and fear, which are also difficult for human annotators to detect.
منابع مشابه
Emotion Classification in Microblog Texts Using Class Sequential Rules
This paper studies the problem of emotion classification in microblog texts. Given a microblog text which consists of several sentences, we classify its emotion as anger, disgust, fear, happiness, like, sadness or surprise if available. Existing methods can be categorized as lexicon based methods or machine learning based methods. However, due to some intrinsic characteristics of the microblog ...
متن کاملEmotion Corpus Construction Based on Selection from Hashtags
The availability of labelled corpus is of great importance for supervised learning in emotion classification tasks. Because it is time-consuming to manually label text, hashtags have been used as naturally annotated labels to obtain a large amount of labelled training data from microblog. However, natural hashtags contain too much noise for it to be used directly in learning algorithms. In this...
متن کاملMicroblog Emotion Classification by Computing Similarity in Text, Time, and Space
Most work in NLP analysing microblogs focuses on textual content thus neglecting temporal and spatial information. We present a new interdisciplinary method for emotion classification that combines linguistic, temporal, and spatial information into a single metric. We create a graph of labeled and unlabeled tweets that encodes the relations between neighboring tweets with respect to their emoti...
متن کاملTowards Scalable Emotion Classification in Microblog Based on Noisy Training Data
The availability of labeled corpus is of great importance for emotion classification tasks. Because manual labeling is too timeconsuming, hashtags have been used as naturally annotated labels to obtain large amount of labeled training data from microblog. However, the inconsistency and noise in annotation can adversely affect the data quality and thus the performance when used to train a classi...
متن کاملUser Embedding for Scholarly Microblog Recommendation
Nowadays, many scholarly messages are posted on Chinese microblogs and more and more researchers tend to find scholarly information on microblogs. In order to exploit microblogging to benefit scientific research, we propose a scholarly microblog recommendation system in this study. It automatically collects and mines scholarly information from Chinese microblogs, and makes personalized recommen...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2015